Machine-readable documents, such as electronic documents, may be classified or otherwise processed using data contained within such documents. In order to classify or further process the documents, it may be desirable to identify meaningful content in the documents.
Such meaningful content may, in some cases, include phrases located within the documents. A phrase is a group of words which functions as a single unit in the syntax of a sentence within the document. Phrases may be useful for identifying, classifying, or further processing the document.
Thus, there exists a need for systems which automatically identify phrases in machine-readable documents.
Similar reference numerals are used in different figures to denote similar components.